Model Based Estimation of Covariance Matrices with Applications to the EM-algorithm

Author

  • Stephen M. Woodruff
Abstract

When minimizing mean square error (or variance) is a primary criterion for choosing an estimator of means or totals, second moment estimates are often necessary as well. Examples include composite estimators, where the component weights are functions of the component variances; generalized least squares estimators, where an estimate of a covariance matrix is required; and the normal EM-algorithm, where the sufficient statistics are functions of first and second moments. In short, estimation of first moments is often intertwined with estimation of second moments. Even when the first moment estimators do not themselves require variances, it may be necessary to estimate the variance of these estimators, and this requires second moment estimates. The variances of second moment estimators are generally functions of population fourth moments, so these second moment estimators can be very unstable. In some cases, data relationships expressed by superpopulation models impose restrictions on the variance/covariance structure and suggest variance/covariance estimators that themselves have relatively small variance. We consider a case where this information takes the form of a Markov superpopulation model, which specifies that the expected current value of an item for a population unit is a function of the realized value of that item in the immediate past. For example, a manufacturer's expected output this year may be roughly proportional to its actual output last year. Such relationships can be expressed in terms of regression superpopulation models, and these models imply restrictions on the covariance matrix of the random variables that describe this longitudinal data. These restrictions can reduce the number of parameters and second moment terms that need to be estimated. This paper expands on two other papers, Woodruff (1989) and Johnson and Woodruff (1990).
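To make the claim about covariance restrictions concrete, here is a minimal sketch under assumptions that are not taken from the paper: a first-order model y_t = beta_t * y_{t-1} + e_t in which each error e_t is uncorrelated with the past and has a fixed unconditional variance. The function names and the parameterization (`var_first`, `betas`, `noise_vars`) are hypothetical. Under these assumptions every covariance is determined by the first variance, the slopes, and the error variances, so an m-month covariance matrix has 2m - 1 free parameters instead of m(m + 1)/2.

```python
from math import prod


def implied_covariance(var_first, betas, noise_vars):
    """Covariance matrix of (y_1, ..., y_m) implied by the illustrative model
    y_t = beta_t * y_{t-1} + e_t, with e_t uncorrelated with earlier values.

    betas[k] and noise_vars[k] link month k to month k + 1 (0-based).
    """
    m = len(betas) + 1
    # Variance recursion: Var(y_t) = beta_t^2 * Var(y_{t-1}) + Var(e_t).
    variances = [var_first]
    for beta, noise in zip(betas, noise_vars):
        variances.append(beta * beta * variances[-1] + noise)
    cov = [[0.0] * m for _ in range(m)]
    for s in range(m):
        for t in range(s, m):
            # For s <= t: Cov(y_s, y_t) = (product of slopes from s to t) * Var(y_s).
            value = prod(betas[s:t]) * variances[s]
            cov[s][t] = cov[t][s] = value
    return cov


def restricted_parameter_count(m):
    """One initial variance plus (slope, error variance) for each later month."""
    return 1 + 2 * (m - 1)


def unrestricted_parameter_count(m):
    """Distinct entries of a symmetric m x m covariance matrix."""
    return m * (m + 1) // 2


if __name__ == "__main__":
    for row in implied_covariance(4.0, [1.0, 0.5], [1.0, 2.0]):
        print(row)
    print(restricted_parameter_count(6), "vs", unrestricted_parameter_count(6))
```

For six months of data the sketch counts 11 restricted parameters against 21 unrestricted ones, which is the kind of reduction the text refers to. The variance structure of the paper's own model may differ from the constant error variances assumed here.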
Both of these papers apply one of the covariance matrix estimators analyzed here to generalized least squares estimation of finite population means and totals in the Bureau of Labor Statistics' (BLS) Current Employment Statistics (CES) survey. The CES survey is the Bureau's largest employment survey. It measures total national employment each month in about 1500 industry cells. Every month, the Bureau publishes estimates of total cell employment for past reference months based on all the CES survey data available for survey reference periods one, two, and three months in the past. Because of delayed reporting, the short interval (one month) between reference date and initial publication date means that initial CES employment estimates may be based on relatively few sample units (often only about half). As time passes and more data arrive, substantial revisions to these initial estimates are sometimes necessary. An estimator developed for the CES survey, which depends on the model based covariance matrix estimator studied here, can substantially reduce these revisions. To summarize the data flow: within one month of the reference date about half the units have responded, within two months about three quarters, and within three months about nine tenths. All the sample data available for the m most current months for the n sample units in an estimation cell can be summarized in an n × m matrix, M, with missing entries, where an entry is missing if a given sample unit (row of M) did not have data for a given reference month (column of M). In the next section, a regression superpopulation model for M is described. This model is similar to the superpopulation models considered by Royall and Cumberland (1981a). An improved estimator of employment level in the CES survey (Johnson and Woodruff, 1990) requires the variance/covariance matrix, Σ, defined in this model.
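The data layout described above can be sketched as follows. This is only an illustration: the function names are hypothetical, missing entries are represented by `None`, and each entry is assumed to be missing independently with the response rates quoted in the text (about 0.9, 0.75, and 0.5 for reference months three, two, and one month in the past). Real reporting delays are unlikely to be independent across months for the same unit.

```python
import random


def simulate_response_matrix(values, response_rates, seed=0):
    """Return a copy of the n x m matrix `values` in which each entry is kept
    with the response rate of its column and replaced by None otherwise."""
    rng = random.Random(seed)
    return [
        [v if rng.random() < rate else None for v, rate in zip(row, response_rates)]
        for row in values
    ]


def observed_fraction(matrix):
    """Fraction of sample units (rows) with data in each reference month (column)."""
    n = len(matrix)
    m = len(matrix[0])
    return [sum(row[j] is not None for row in matrix) / n for j in range(m)]


def complete_pairs(matrix, j, k):
    """Number of units observed in both month j and month k; only these rows
    contribute directly to a sample covariance between the two months."""
    return sum(row[j] is not None and row[k] is not None for row in matrix)


if __name__ == "__main__":
    # Columns ordered from oldest to most recent reference month.
    rates = [0.9, 0.75, 0.5]
    data = [[100.0, 102.0, 101.0] for _ in range(1000)]
    M = simulate_response_matrix(data, rates)
    print(observed_fraction(M))
    print(complete_pairs(M, 1, 2))
```

The count of complete pairs shows why second moment estimates involving the most recent month rest on the fewest units.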
This paper describes an estimator of Σ based on the regression superpopulation model (REM estimate) and compares it empirically with the estimator of Σ from the Normal EM-algorithm (NEM estimate). Little and Rubin (1987) give a clear and complete description of the Normal EM-algorithm (NEM). For more detail on the EM-algorithm, see Beale and Little (1975) or Dempster and Laird (1977). In the simulation study, the absolute error of the two estimators is compared. The REM estimate of Σ uses additional stochastic structure beyond the multivariate normality from which the NEM is derived. Thus, it is not surprising that the REM has smaller absolute error than the NEM for estimating Σ. However, the size of this reduction in absolute error is surprising. Although this application of a regression superpopulation model to derive the REM covariance matrix estimate may be of marginal interest in itself, the REM estimate is an important component of the GLS estimator (Link90) described in Johnson and Woodruff (1990). In addition, it is computationally far cheaper than the NEM estimate, since it does not involve iterative recomputations, possibly several hundred, to convergence.
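The iterative character of the normal EM-algorithm can be seen in a minimal sketch for the simplest case: two months, where the first value is always observed and the second is sometimes missing. This is a textbook-style illustration, not the NEM implementation used in the paper; the function name and the stopping rule are assumptions, and the sketch assumes at least one complete pair and a nonzero variance in the first month.

```python
def normal_em_bivariate(pairs, max_iter=500, tol=1e-10):
    """Normal EM for pairs (y1, y2) where y2 may be None (missing).

    Returns (mu1, mu2, s11, s12, s22, iterations) using maximum likelihood
    (divide by n) moments.
    """
    n = len(pairs)
    # y1 is fully observed, so its ML mean and variance never change.
    mu1 = sum(y1 for y1, _ in pairs) / n
    s11 = sum((y1 - mu1) ** 2 for y1, _ in pairs) / n
    # Starting values for the remaining parameters from the complete cases.
    complete = [(y1, y2) for y1, y2 in pairs if y2 is not None]
    mu2 = sum(y2 for _, y2 in complete) / len(complete)
    s22 = sum((y2 - mu2) ** 2 for _, y2 in complete) / len(complete)
    s12 = 0.0
    iteration = 0
    for iteration in range(1, max_iter + 1):
        # E-step: expected y2, y2^2 and y1*y2 given y1 and current parameters.
        slope = s12 / s11
        resid_var = s22 - s12 * slope
        sum_y2 = sum_y2sq = sum_y1y2 = 0.0
        for y1, y2 in pairs:
            if y2 is None:
                e_y2 = mu2 + slope * (y1 - mu1)
                e_y2sq = e_y2 ** 2 + resid_var
            else:
                e_y2, e_y2sq = y2, y2 ** 2
            sum_y2 += e_y2
            sum_y2sq += e_y2sq
            sum_y1y2 += y1 * e_y2
        # M-step: moments from the expected sufficient statistics.
        new_mu2 = sum_y2 / n
        new_s22 = sum_y2sq / n - new_mu2 ** 2
        new_s12 = sum_y1y2 / n - mu1 * new_mu2
        change = max(abs(new_mu2 - mu2), abs(new_s22 - s22), abs(new_s12 - s12))
        mu2, s22, s12 = new_mu2, new_s22, new_s12
        if change < tol:
            break
    return mu1, mu2, s11, s12, s22, iteration


if __name__ == "__main__":
    data = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 7.8), (5.0, None), (6.0, None)]
    print(normal_em_bivariate(data))
```

Even in this two-month case the estimate is reached only after repeated passes over the data, whereas a regression-based estimate can be computed in a single pass.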

Similar resources

Structure of Wavelet Covariance Matrices and Bayesian Wavelet Estimation of Autoregressive Moving Average Model with Long Memory Parameter’s

In the process of exploring and recognizing of statistical communities, the analysis of data obtained from these communities is considered essential. One of appropriate methods for data analysis is the structural study of the function fitting by these data. Wavelet transformation is one of the most powerful tool in analysis of these functions and structure of wavelet coefficients are very impor...

Tuning of Extended Kalman Filter using Self-adaptive Differential Evolution Algorithm for Sensorless Permanent Magnet Synchronous Motor Drive

In this paper, a novel method based on a combination of Extended Kalman Filter (EKF) with Self-adaptive Differential Evolution (SaDE) algorithm to estimate rotor position, speed and machine states for a Permanent Magnet Synchronous Motor (PMSM) is proposed. In the proposed method, as a first step SaDE algorithm is used to tune the noise covariance matrices of state noise and measurement noise i...

Analysis of Incomplete Climate Data: Estimation of Mean Values and Covariance Matrices and Imputation of Missing Values

Estimating the mean and the covariance matrix of an incomplete dataset and filling in missing values with imputed values is generally a nonlinear problem, which must be solved iteratively. The expectation maximization (EM) algorithm for Gaussian data, an iterative method both for the estimation of mean values and covariance matrices from incomplete datasets and for the imputation of missing val...

Joint State and Parameter Estimation of Squirrel Cage Induction Motor – A System Identification Approach using EM based Extended Kalman Filter

This paper deals with the recursive optimum estimation of rotor resistance, inductance and stator resistance of induction motor. The estimation of parameter and states in the presence of system noise is achieved using EKF, which takes in to account measurement and modelling inaccuracies. A major limitation in the parameter estimation using EKF is that its optimality is dependent on the choice o...

Maximum likelihood estimation of Gaussian mixture models using stochastic search

Gaussian mixture models (GMM), commonly used in pattern recognition and machine learning, provide a flexible probabilistic model for the data. The conventional expectation–maximization (EM) algorithm for the maximum likelihood estimation of the parameters of GMMs is very sensitive to initialization and easily gets trapped in local maxima. Stochastic search algorithms have been popular alternati...

Application of Model-Based Estimation to Time-Delay Estimation of Ultrasonic Testing Signals

Time-Delay-Estimation (TDE) has been a topic of interest in many applications in the past few decades. The emphasis of this work is on the application of model-based estimation (MBE) for TDE of ultrasonic signals used in ultrasonic thickness gaging. Ultrasonic thickness gaging is based on precise measurement of the time difference between successive echoes which reflect back from the back wall ...

Publication date: 2002